Cross-concordances: terminology mapping and its effectiveness for information retrieval
Abstract
The German Federal Ministry for Education and Research funded a major terminology mapping initiative at the GESIS Social Science Information Centre (GESIS-IZ), which concluded in 2007. The task of this initiative was to organize, create and manage ‘cross-concordances’ between controlled vocabularies (thesauri, classification systems, subject heading lists), centred on the social sciences but quickly extending to other subject areas. Cross-concordances are intellectually (manually) created crosswalks that determine equivalence, hierarchy, and association relations between terms from two controlled vocabularies. Most vocabularies in our project have been related bilaterally. In a bilateral cross-concordance, terms are mapped from vocabulary A to vocabulary B and, conversely, from vocabulary B to vocabulary A. To date, 25 controlled vocabularies from 11 disciplines (see Figure 1) and 3 languages (German, English and Russian) have been connected, with vocabulary sizes ranging from 1,000 to 17,000 terms per vocabulary. More than 513,000 relations were generated in 64 crosswalks.
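The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the project's actual data model: the class names (`Relation`, `Mapping`, `CrossConcordance`), the vocabulary labels "A" and "B", and the sample terms are all hypothetical. It assumes only what the abstract states, namely that a relation is one of equivalence, hierarchy or association, and that a bilateral cross-concordance holds mappings in both directions.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # The three relation types named in the abstract.
    EQUIVALENCE = "equivalence"
    HIERARCHY = "hierarchy"
    ASSOCIATION = "association"


@dataclass(frozen=True)
class Mapping:
    # One directed relation from a term in one vocabulary to a term in another.
    source_vocab: str
    source_term: str
    relation: Relation
    target_vocab: str
    target_term: str


class CrossConcordance:
    """Hypothetical bilateral crosswalk between two controlled vocabularies."""

    def __init__(self, vocab_a, vocab_b):
        self.vocabs = (vocab_a, vocab_b)
        # Index by (vocabulary, term) so either direction can be looked up.
        self._index = defaultdict(list)

    def add(self, mapping):
        # A mapping must connect the two vocabularies of this crosswalk.
        if {mapping.source_vocab, mapping.target_vocab} != set(self.vocabs):
            raise ValueError("mapping does not connect the two vocabularies")
        self._index[(mapping.source_vocab, mapping.source_term)].append(mapping)

    def lookup(self, vocab, term, relation=None):
        # Return all mappings for a term, optionally filtered by relation type.
        hits = self._index.get((vocab, term), [])
        return [m for m in hits if relation is None or m.relation is relation]


# Illustrative terms only; they are not taken from the project's vocabularies.
cc = CrossConcordance("A", "B")
cc.add(Mapping("A", "adolescent", Relation.EQUIVALENCE, "B", "youth"))
cc.add(Mapping("B", "youth", Relation.EQUIVALENCE, "A", "adolescent"))
cc.add(Mapping("A", "adolescent", Relation.ASSOCIATION, "B", "youth work"))

targets = [m.target_term for m in cc.lookup("A", "adolescent")]
print(targets)
```

Storing each direction as a separate entry is a design assumption of this sketch; it allows the A-to-B and B-to-A mappings to differ, which a single symmetric table would not.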
Similar resources
Building a Terminology Network for Search: The KoMoHe Project
The paper reports on results of the GESIS-IZ project “Competence Center Modeling and Treatment of Semantic Heterogeneity” (KoMoHe). KoMoHe supervised a terminology mapping effort, in which ‘cross-concordances’ between major controlled vocabularies were organized, created and managed. In this paper we describe the establishment and implementation of cross-concordances for search in a digital l...
Treatment of Semantic Heterogeneity in Information Retrieval
The first step in handling semantic heterogeneity should be the attempt to enrich the semantic information about documents, i.e. to fill the gaps in the documents' metadata automatically. Section 2 describes a set of cascading deductive and heuristic extraction rules, which were developed in the project CARMEN for the domain of Social Sciences. The mapping between different terminologies can b...
Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval
The present paper presents an approach to bilingual lexicon extraction from non-aligned comparable corpora and to phrasal translation, together with evaluations on Cross-Language Information Retrieval. A two-stage translation model is proposed for the acquisition of bilingual terminology from comparable corpora, disambiguation and selection of best translation alternatives according to their...
The European Thesaurus on International Relations and Area Studies - a Multilingual Resource for Indexing, Retrieval, and Translation
The multilingual European Thesaurus on International Relations and Area Studies (European Thesaurus) is a special subject thesaurus for the field of international affairs. It is intended for use in libraries and documentation centres of academic institutions and international organizations. The European Thesaurus was established in a collaborative project involving a number of leading European ...
A Comparative Study of the Thesaurus of Islamic Knowledge and the Thesaurus of Quranic Sciences
This study compares the strengths and weaknesses of the Thesaurus of Islamic Knowledge and the Thesaurus of Quranic Sciences. In today's society, where documents are kept electronically, the retrieval and dissemination of information for the development of research matters far more than the mere storage of documents, and the thesaurus, which is the basis for indexing in various sciences, is one of the solutions fo...
Journal title:
- CoRR
Volume: abs/0806.3765, Issue: -
Pages: -
Publication date: 2008